DETAILED ACTION

1. This Office action for US Patent Application 10/743,722 is responsive to 
communications filed 05 May 2008, in reply to the Non-Final Rejection of 04 February 
2008. Currently, claims 1-8, 10-24, 29, and 31-33 are pending. 

2. In the previous Office action, claims 1, 5, 10-13, 18-22, 29, and 32-33 were 
rejected under 35 U.S.C. 103(a) as obvious over "Temporally Adaptive Interpolation 
Exploiting Temporal Masking in Visual Perception" (Lee et al.) in view of US Patent 
Application Publication 2003/0142748 A1 (Tourapis et al.). Claims 2, 6-8, and 17 were 
rejected under 35 U.S.C. 103(a) as obvious over Lee et al. in view of Tourapis et al. and 
"Scene-Context Dependent Reference Frame Placement for MPEG Video Coding" (Lan 
et al.). Claims 3, 4, 14, and 23 were rejected under 35 U.S.C. 103(a) as obvious over 
Lee et al. in view of Tourapis et al. and US Patent Application Publication 2002/0146071 
A1 (Liu et al.). Claims 15 and 24 were rejected under 35 U.S.C. 103(a) as obvious over 
Lee et al. in view of Tourapis et al. and "MPEG Video Compression Standard" 
(Mitchell). Claim 16 was rejected under 35 U.S.C. 103(a) as obvious over Lee et al. in 
view of Tourapis et al. and "Digitale Bildcodierung" (Ohm). Claim 31 was rejected under 
35 U.S.C. 103(a) as obvious over Tourapis et al. in view of "Video Indexing Using 
MPEG Motion Compensation Vectors" (Ardizzone et al.). 
Response to Arguments

3. Applicant's arguments filed 05 May 2008 have been fully considered but they are 
not persuasive. Applicant states that the prior art does not teach the claimed limitation 
in claim 10 of coding a picture as a B picture in case of consistent motion speed, with 
the Tourapis reference merely coding B frames (determined in another process) in a 
direct mode "assuming that speed is constant". It is respectfully submitted that 
Applicant mischaracterized the combination made between Lee et al. and Tourapis et 
al. While it is true that Lee et al. does not teach assigning B pictures "if the motion 
speeds are consistent with each other", Lee does teach assigning B pictures based on a 
consistency measure between pictures, with frames found to be within a single 
"temporal segment" with a small change in pictures encoded as B frames (p. 515: 
columns 1-2). Tourapis et al. was not relied on to teach "coding the respective picture 
as a B picture", but only that constant motion speed is a known measure of picture 
consistency, and suitable in the Lee et al. reference to determine the boundaries of a 
temporal segment rather than motion compensation error, a great difference in frames, 
or an accumulation of small differences over several frames. In addition, Tourapis et al. 
discloses a Direct prediction decision module 1208 (paragraphs 0098-0099), which 
determines when to code a block in a direct mode, and so determines when or how the 
assumption that motion speed is constant applies to an input sequence of video data. 
Therefore, the Examiner maintains all prior art rejections based on the combination of 
Lee et al. and Tourapis et al. 
Claim Rejections - 35 USC § 103

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1, 5, 10-13, 18-22, 29, and 32-33 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over "Temporally Adaptive Interpolation Exploiting Temporal 
Masking in Visual Perception" (Lee et al.), in view of US Patent Application Publication 
2003/0142748 A1 (Tourapis et al.). Lee et al. teaches a method for dynamically 
determining a Group of Picture (GOP) structure in a video based on temporal 
segmentation. Regarding claim 1, in one embodiment of Lee et al., temporal 
segmentation is determined from a motion compensation error determination (pg. 519: 
column 1), which must inherently use motion vectors to determine a predicted image to 
be compared with an actual image. Lee et al. also incorporates a "typical motion 
compensation encoder" (pg. 514: column 2), which includes a motion estimation unit. 
Then, Lee et al. discloses "computing motion vectors for a plurality of pictures". 

Consider the determination of temporal segmentation based on motion 
compensation error in Lee et al. If the error between an actual frame and a predicted 
frame becomes too great, then it is determined that there is little consistency between 
frames, but if there is a small error, then temporally adjacent frames are considered to 
exhibit consistency. This information is used in a detector that finds a scene 
segmentation point, which is a point at which small changes in a single scene have 
accumulated past a certain threshold away from a reference frame. The frame 
immediately preceding the scene segmentation point becomes a P frame, and the 
frames in between the last reference frame and the scene segmentation point are 
encoded as B frames (pg. 515: columns 1-2). Then, Lee et al. teaches assigning 
pictures as B pictures based on a consistency measure. However, as discussed in the 
interview of November 15, determining motion compensation error per se is not 
considered the same as determining consistent motion speed. 
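
For illustration only, the picture-type assignment described above can be sketched as follows. This is a simplified reading of the description in this action, not code from Lee et al.; the function name `assign_picture_types`, the per-frame change measure `frame_diffs`, and the threshold value are hypothetical.

```python
def assign_picture_types(frame_diffs, threshold):
    """Assign I/P/B types from accumulated frame-to-frame change.

    frame_diffs[k] is a hypothetical change measure between frame k-1 and
    frame k (frame_diffs[0] is ignored). Frame 0 is taken as the initial
    reference (I) frame.
    """
    n = len(frame_diffs)
    if n == 0:
        return []
    types = ["B"] * n          # B is the default picture type
    types[0] = "I"
    last_ref = 0
    accumulated = 0.0
    for k in range(1, n):
        accumulated += frame_diffs[k]
        if accumulated > threshold:
            # Frame k is the segmentation point; the frame just before it
            # becomes the new P reference. If that frame is already the
            # reference, frame k itself is used (an assumption of this sketch).
            new_ref = k - 1 if k - 1 > last_ref else k
            types[new_ref] = "P"
            last_ref = new_ref
            # Restart accumulation relative to the new reference.
            accumulated = frame_diffs[k] if new_ref == k - 1 else 0.0
    return types


if __name__ == "__main__":
    print(assign_picture_types([0, 1, 1, 1, 1, 1, 1], threshold=2.5))
```

In this sketch, trailing B frames are left without a closing reference; a real encoder would need to terminate the sequence with a reference picture.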

Tourapis et al. teaches a video coder that encodes inter macroblocks using 
various modes. In one mode, a "Direct prediction mode", a current macroblock in a B 
picture may be calculated from previously-decoded motion information (paragraph 
0067). Then, the motion for the current picture is just re-used from the previous picture, 
instead of being re-coded and re-transmitted. When motion speed is determined to be 
constant, the motion for the current macroblock is directly taken from the corresponding 
macroblock in a reference frame (paragraph 0068). This determination is known as 
Motion Projection. Then, in Tourapis et al., a constant motion speed is known as a 
measure of consistency between pictures. 
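
The constant-speed assumption discussed above can be illustrated with a short sketch. It is a generic illustration of temporal scaling of a motion vector and of a speed-consistency check, not the method of Tourapis et al. or of the present claims; the function names, the tolerance `tol`, and the use of frame intervals as time units are assumptions.

```python
def project_motion(mv, dt_ref, dt_cur):
    """Scale motion vector mv, measured over dt_ref frame intervals, to
    dt_cur frame intervals, assuming constant speed (displacement is
    proportional to elapsed time)."""
    scale = dt_cur / dt_ref
    return (mv[0] * scale, mv[1] * scale)


def is_speed_consistent(mv_a, dt_a, mv_b, dt_b, tol=0.5):
    """Return True if two motion vectors imply about the same velocity.

    Each vector is divided by its own time interval; the per-component
    velocity difference must not exceed tol (a hypothetical tolerance).
    """
    vax, vay = mv_a[0] / dt_a, mv_a[1] / dt_a
    vbx, vby = mv_b[0] / dt_b, mv_b[1] / dt_b
    return abs(vax - vbx) <= tol and abs(vay - vby) <= tol


if __name__ == "__main__":
    # A block that moved (4, -2) over two intervals is projected to one interval.
    print(project_motion((4, -2), 2, 1))
    print(is_speed_consistent((4, -2), 2, (2, -1), 1))
```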

Lee et al. discloses the claimed invention except for determining a picture mode 
from a calculation of consistent motion speed. Tourapis et al. teaches that it was known 
to determine motion compensation mode as a result of a motion projection calculation of 
constant motion. Therefore, it would have been obvious to one having ordinary skill in 
the art to determine a picture mode based on the validity of an assumption of constant 
motion, as taught by Tourapis et al., since Tourapis et al. states in paragraph 0118 that 
such a modification would enable a direct mode coding of blocks in B pictures, further 
exploiting temporal redundancy with a current picture and reference pictures. 

Regarding claim 5, the method of Lee et al. could be adjusted to insert 1-3 
default P frames in a GOP to avoid encoding delay (pg. 516, column 2 - pg. 517, 
column 1). For a 16-frame GOP, if 1 P-frame is inserted, for example, no more than 8 
B-frames could be inserted consecutively. Even if no P-frames are inserted by default 
in a GOP, the number of consecutive B-frames is limited by the GOP size of 15 or 16 
frames, since a GOP starts with an I-frame. 

Regarding claims 10-13 and 33, in Lee et al., two kinds of segmentation are 
determined, corresponding with the claimed "termination condition". The first type of 
termination is the determination of a P picture, reached when an accumulated error in 
pictures goes past a certain threshold. This corresponds with a failure in the motion 
projection of Tourapis et al., in which case it is determined that a Direct Mode coding is 
inappropriate. When the threshold is reached, the frame immediately preceding the 
segmentation point becomes a P frame, and the frames in between the last reference 
frame and the scene segmentation point are encoded as B frames (pg. 515: columns 1- 
2). Another segmentation detector determines an abrupt scene change, and encodes 
an I frame at the start of a new scene and a P frame at the end of the previous scene 
(pg. 515, column 1). 

Regarding claim 18, figure 1 of Lee et al. shows a Temporally Adaptive Motion 
Interpolation (TAMI) encoder. This encoder includes a buffer, a conventional MPEG 
encoder, a motion estimation unit, a scene segmentation point (SSP) detector, and a 
GOP Structure unit (pg. 514, column 2 - pg. 515, column 1). If this GOP Structure Unit 
performs the Motion Projection calculation of Tourapis et al., it corresponds with the 
claimed "colinearity detector". Regarding claim 19, the TAMI unit determines the 
positions of P and B pictures in a GOP (page 514, column 2). Regarding claim 20, as 
mentioned previously, motion projection may be determined from the colinearity of 
motion vectors. Regarding claim 21, the Abrupt Scene Change (ASC) detector 
determines a scene change in an encoded video. Regarding claim 22, as mentioned 
above, at a scene change, an old scene ends with a P-frame and a new scene starts 
with an I-frame. 

Regarding claim 29, in Tourapis et al., figure 6 illustrates a direct mode P 
picture at time t+2, in which the motion vector (dx, dy) for the corresponding block A at 
time t+1 is extended for current block B. This corresponds with the claimed iterative 
method. Regarding claim 32, in Tourapis et al., direct mode blocks have directly 
temporally scaled motion vectors (paragraphs 0118-0119). 

6. Claims 2, 6-8, and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Lee et al. in view of Tourapis et al., as applied to claims 1 and 10 above, in view of 
"Scene-Context Dependent Reference Frame Placement for MPEG Video Coding" (Lan 
et al.), cited in the Information Disclosure Statement filed 12 May 2004. Claim 2 of the 
present application recites encoding the first frame with a variance in motion speed as a 
P-frame. However, in Lee et al., the first frame with a motion inconsistency above a 
certain threshold is encoded as an I-frame, and the frame immediately previous to this 
point is encoded as a P-frame (pg. 515, column 2). 

Lan et al. teaches a picture-type assignment algorithm in which if the difference 
in accumulated motion between a current frame and a reference frame is above a 
certain value, the current frame is encoded as a P-frame, and becomes the next 
reference frame (pg. 481, column 2). 

Lee et al., in combination with Tourapis et al., discloses the claimed invention 
except for encoding the first frame that does not follow a frame trend as a P-frame. Lan 
et al. teaches that it was known to encode a significantly changed frame as a P-frame. 
Therefore, it would have been obvious for one having ordinary skill in the art at the time 
the invention was made to encode reference frames as P-frames rather than I-frames 
as taught by Lan et al., since it was well-known in the art that P-frames require fewer bits 
to be encoded than I-frames. 

Additionally, claims 6 and 17 recite coding some pictures as I pictures for a 
random-access policy. Lee et al. and Tourapis et al. do not teach this limitation. Lan et 
al. teaches an MPEG coding method in which frame type assignment is varied. 
Regarding claims 6 and 17, Lan et al. discloses forcing I frames into a coded video 
sequence every 15 frames to facilitate random access (pg. 486, column 1). Regarding 
claim 7, in Lan et al., whenever an I-frame is encoded, the previous frame is encoded 
as a P-frame (pg. 481, column 1). Regarding claim 8, in Lee et al., P frames can be 
encoded as P1 frames which are regular MPEG P frames, or as P2 frames, which have 
the same bit allocation as MPEG B frames and are thus coarsely quantized (pg. 514, 
column 2). 

Lee et al., in combination with Tourapis et al., discloses the claimed invention 
except for forcing I-frame encoding. Lan et al. teaches that it was known to encode 
I-frames at regular intervals. Therefore, it would have been obvious to one having 
ordinary skill in the art at the time the invention was made to modify the coding method 
of Lee et al. to insert periodic I frames as taught by Lan et al., since Lan et al. states in 
page 486, column 1 that such a modification would enable random search and pause 
features at playback time. 
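
As a rough sketch of the combination discussed above, periodic I-frames can be forced onto an existing picture-type sequence as follows. The function name `force_periodic_i_frames` and the list-of-strings representation are hypothetical; the 15-frame period follows the figure cited from Lan et al., and promoting the preceding B frame to a P frame reflects the treatment of claim 7 above.

```python
def force_periodic_i_frames(types, period=15):
    """Return a copy of types with an I-frame forced every `period` frames.

    When an I-frame is forced at position k, the frame at k-1 is promoted
    to a P-frame if it was a B-frame, so that the preceding run of
    B-frames is closed by a reference picture.
    """
    out = list(types)
    for k in range(0, len(out), period):
        out[k] = "I"
        if k > 0 and out[k - 1] == "B":
            out[k - 1] = "P"
    return out


if __name__ == "__main__":
    gop = force_periodic_i_frames(["B"] * 32)
    print("".join(gop))
```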

7. Claims 3, 4, 14, and 23 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Lee et al. in view of Tourapis et al. as applied to claims 1, 12, and 21, 
in view of US Patent Application Publication 2002/0146071 A1 (Liu et al.). Lee et al. 
teaches scene change detection, but always encodes the first picture after the scene 
change as an I-frame and the last picture before the scene change as a P-frame. 

Liu et al. teaches a scene change detection component in a video encoder. In 
Liu et al., a scene change is normally encoded as an I-frame. However, this is not 
always the most efficient coding method. Regarding claims 3, 14, and 23, Figure 10 
shows a scene change between frame 1001 and frame 1002. Frame 1001 was 
originally scheduled to be encoded as an I-frame, but since a scene change 
immediately follows, much computational effort would be wasted in calculating 
high-quality images immediately after the scene change. Then, frame 1001 is instead 
encoded as a P-frame, and frames 1002 and 1048 are encoded as low-quality 
predictive frames, since human vision is insensitive to quality changes near a scene 
change (paragraph [0079]), corresponding with the claimed coding of a picture before a 
scene change at full quality or low quality in claim 4. Figure 11 gives a further example. 
Here, a scene change occurs immediately preceding a P-frame 1102. Frame 1104, two 
frames before the scene change, was originally scheduled as an I-frame, but instead 
the I-frame is delayed until frame 1110, for which motion vectors have not yet been 
calculated (paragraph [0080]). Finally, figure 13 shows a scene change immediately 
preceding P-frame 1302, which was originally scheduled as an I-frame. However, since 
motion vectors 1304 and 1306 to frame 1302 have already been calculated, the I-frame 
is delayed until frame 1308, originally scheduled to be the next P-frame (paragraph 
[0082]). 

Lee et al., in combination with Tourapis et al., teaches the claimed invention 
except for encoding P-frames immediately surrounding scene changes. Liu et al. 
teaches that it was known to encode a frame immediately preceding or immediately 
following a scene change as a P-frame. Therefore, it would have been obvious to one 
having ordinary skill in the art at the time the invention was made to encode frames 
adjacent to scene changes as P-frames as taught by Liu et al., since Liu et al. states in 
paragraph [0079] that such a modification would increase encoding efficiency by not 
encoding irrelevant data near a scene change, at which time the human eye cannot 
clearly distinguish details of an image. 
8. Claims 15 and 24 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Lee et al., in view of Tourapis et al., as applied to claims 10 and 21 above, and in 
further view of "MPEG Video Compression Standard" (Mitchell), cited in the Information 
Disclosure Statement of 17 July 2006. Although in Lee et al., a default picture is 
encoded as a B-frame, Lee et al. does not explicitly state that pictures adjacent to scene 
changes are B-frames. However, Mitchell states that since the eye is insensitive to 
image content near scene changes, image quality can be sacrificed. Regarding claims 
15 and 24, one method of reducing image quality is to start a new scene with B pictures 
(footnote 13). 

Lee et al., in combination with Tourapis et al., discloses the claimed invention 
except for encoding B-frames adjacent to a scene change. Mitchell teaches that it was 
known to encode B-frames immediately following a scene change. Therefore, it would 
have been obvious for one having ordinary skill in the art at the time the invention was 
made to force B-frames immediately following a scene change, as taught by Mitchell, 
since Mitchell states in page 79 that such a modification would reduce the bit rate 
needed to encode a scene change. 

9. Claim 16 is rejected under 35 U.S.C. 103(a) as being unpatentable over Lee et 
al. in view of Tourapis et al. as applied to claim 12 above, and in further view of "Digitale 
Bildcodierung" (Ohm), cited in the Information Disclosure Statement of 17 July 2006. 
Lee et al. teaches scene change detection based on a low correlation between two 
images (pg. 515, column 1), but does not disclose the exact method used. Ohm 
teaches the Normalized Cross-Correlation Function (NCCF), shown as equation 5.52. 
Regarding claim 16, NCCF is used in many pattern-matching applications, such as 
motion estimation (pg. 1). Two images, x_a(m_a, n_a) and y_j(m_a, n_a), are compared over 
pixels (m_a, n_a) in area A. This corresponds with images x_n(i, j) and x_{n+1}(i, j) in area 
(M, N) in the present invention. Two pictures have the highest match when the NCCF is 
at a maximum (pg. 3), and correspondingly, two pictures have a low match, indicative of 
a scene change, when the value of NCCF is low. 
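
A minimal sketch of a normalized cross-correlation between two equally sized images is given below for illustration. It uses one common form (sum of products divided by the square root of the product of the two image energies); equation 5.52 of Ohm is not reproduced in this action and may differ, for example by removing the mean. The function name and the nested-list image representation are assumptions.

```python
import math


def normalized_cross_correlation(x, y):
    """Normalized cross-correlation of two equally sized 2-D pixel arrays.

    x and y are lists of rows of numeric pixel values. The result is near
    1.0 for images that match up to a positive scale factor and lower for
    dissimilar images. Returns 0.0 if either image has zero energy.
    """
    num = 0.0
    energy_x = 0.0
    energy_y = 0.0
    for row_x, row_y in zip(x, y):
        for px, py in zip(row_x, row_y):
            num += px * py
            energy_x += px * px
            energy_y += py * py
    denom = math.sqrt(energy_x * energy_y)
    return num / denom if denom > 0 else 0.0


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[1, 0], [0, 0]]
    c = [[0, 1], [0, 0]]
    print(normalized_cross_correlation(a, a))  # matching images
    print(normalized_cross_correlation(b, c))  # non-overlapping content
```

A scene change detector along these lines would compare the returned value for consecutive pictures against a threshold, the value of which is not specified here.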

Lee et al., in combination with Tourapis et al., discloses the claimed invention 
except for the exact method used to determine correlation of two images. Ohm teaches 
that it was known to determine how closely two images match each other with 
Normalized Cross-Correlation. Therefore, it would have been obvious to one having 
ordinary skill in the art at the time the invention was made to determine the correlation 
of two images using NCCF, as taught by Ohm, since Ohm states in page 4 that such a 
modification would allow for a more accurate comparison of the similarity of two images 
rather than by difference levels alone. 

10. Claim 31 is rejected under 35 U.S.C. 103(a) as being unpatentable over Lee et 
al. in view of Tourapis et al., as applied to claim 29 above, and in further view of "Video 
Indexing Using MPEG Motion Compensation Vectors" (Ardizzone et al.). 
Conventionally, a motion vector for a block is defined as the displacement of the block 
between two pictures, velocity is defined as displacement over time, and speed is 
defined as the magnitude of velocity. However, while two-dimensional displacement is 
normally given with the Euclidean distance metric, the square root of the sum of the 
squares of the x and y components, in claim 31, displacement is given with the 
Manhattan distance metric, the sum of the absolute values of the x and y components. Ardizzone et al. 
teaches a method for spatially segmenting an MPEG image with motion vectors (pg. 
725, columns 1-2). In one step of Ardizzone et al., magnitudes of the motion vectors 
are built into a histogram to determine "dominant" regions of the image (pg. 727, column 
2). If a motion vector has a large magnitude, this means that its macroblock is 
displaced a large distance, and so has a high speed. An experiment was performed to 
determine how best to retrieve related images to a given image, by matching motion 
vector characteristics (pg. 728, column 2 - pg. 729, column 1). Regarding claim 31, 
using a Manhattan distance metric yielded the best result (pg. 729, column 1). 

Lee et al. discloses the claimed invention except for defining pixel block displacement 
with a Manhattan distance metric. Ardizzone et al. teaches that it was known to 
calculate motion vector magnitude with Manhattan distance. Therefore, it would have 
been obvious to one having ordinary skill in the art at the time the invention was made 
to determine motion speed of an image based on the Manhattan distance metric, as 
taught by Ardizzone et al., since Ardizzone et al. states in page 729, column 1, that 
such a modification would produce the greatest accuracy in characterizing the motion 
vectors of the image. 
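
The difference between the two distance metrics discussed above can be shown with a short sketch. The function names are hypothetical, and a motion vector is assumed to be an (x, y) pair of displacement components.

```python
import math


def manhattan_magnitude(mv):
    """Manhattan (L1) magnitude: sum of the absolute x and y components."""
    return abs(mv[0]) + abs(mv[1])


def euclidean_magnitude(mv):
    """Euclidean (L2) magnitude: square root of the sum of squares."""
    return math.hypot(mv[0], mv[1])


if __name__ == "__main__":
    mv = (3, -4)
    print(manhattan_magnitude(mv))  # 7
    print(euclidean_magnitude(mv))  # 5.0
```

The Manhattan form avoids the multiplications and square root of the Euclidean form, at the cost of overstating the magnitude of diagonal motion.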
Conclusion

11. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to David N. Werner whose telephone number is 
(571) 272-9662. The examiner can normally be reached on Monday-Friday from 10:00-6:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Mehrdad Dastouri, can be reached on (571) 272-7418. The fax phone 
number for the organization where this application or proceeding is assigned is 
571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Marsha D. Banks-Harold/ 

Supervisory Patent Examiner, Art Unit 2621 

/D. N. W./ 

Examiner, Art Unit 2621 



